
\chapter {Discussion of Results} \label{discussion}

\section{First Experiment Discussion}

	Our initial hypothesis was that overspecification improves the lexeme acquisition rate. According to this hypothesis, the subjects who received overspecified references should acquire vocabulary more efficiently than those who received minimal ones. This was confirmed by our results: the overall lexeme acquisition rate in the OR condition was significantly higher than that in the MR condition.

	For Taxonomical and Absolute lexeme acquisition rates, the OR results were almost twice as high as the MR ones. For Taxonomical lexemes, the OR acquisition rate was 100\%, compared to an MR acquisition rate of only 53\%. For Absolute lexemes, the acquisition rates were somewhat lower (87.5\% for OR and 45\% for MR). This difference can be explained by the Primacy of Nouns \cite{gentner82}, which postulates that new nouns are easier to acquire because the category corresponding to nouns is conceptually simpler than those of verbs or adjectives.

	Other objective metrics that we consider are the error overcoming rate and the success rate, both of which are higher for the OR condition. By the Second Test Phase, the number of errors committed by the OR subjects had dropped, giving a 43\% error overcoming rate, compared to only 29\% for MR subjects (see Table 2). In addition, the Exercise Phase success rate for OR was 8.7\% higher, meaning that, on average, subjects were better at identifying objects when overspecification was provided (see Table 3). This corroborates the higher utility of the Exercise Phase in the OR condition.

	A possible explanation for these results lies in how the REs could be resolved. In the experiment, if the RE given was the Russian equivalent of ``yellow chair'', the MR subject had no way of inferring which object was being referred to unless they knew \emph{both} the Absolute property ``yellow'' \emph{and} the Taxonomical property ``chair''. However, if the RE given was ``yellow chair on the left of the red light'', then even if the subject did not remember the first two words, they could use the rest of the RE to \emph{infer} the object in question. A possible indication of this inference process is that it took OR subjects twice as long to resolve the REs (see Table 3). We believe that if the subjects were able to infer the referent of the RE and received positive feedback after identifying it, they could then make the connection between the previously missing properties and the properties of the object chosen, and thereby acquire new lexemes and improve their performance in the Second Test Phase.

	A final objective parameter that we examined in our analysis was whether the visibility of the object referred to affected the resolution of the reference. In general, visual saliency plays a key role in the resolution of REs \cite{Kelleher_2008,Kelleher_Costello_Vangenabith_2005}. Of the 178 references given when the object was in the subject's field of vision, only 105 were successfully resolved, which is a 59\% success rate, lower than the average Exercise Phase success rate for the two groups. This means that the situatedness of the subject within the virtual world did not affect the success rate as strongly as expected. The reason for this might be that, in the context of our experiment, each subsequent reference was given at the moment the button corresponding to the previous one was pushed, so that the subject was facing that button rather than having a general view of the room. The effect of visual saliency would therefore have to be tested further in a different experiment, for instance one in which the subject receives a reference while looking at more objects in the room.
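
	As a quick check, the success rate for references given while the referent was visible follows directly from the two counts reported above:
\[
	\frac{105}{178} \approx 0.59 = 59\%.
\]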

	For subjective metrics, two questions are of particular interest to our results: whether the subjects thought that the descriptions given in Russian were appropriate, and to what extent they thought that the Exercise Room helped them remember the words learned (Q3 and Q4 in Table 4). For Q3, we found no statistically significant difference between the OR and MR conditions. That is, even though OR descriptions may be considered more ``cognitively demanding'', the subjects did not judge them more difficult to understand. Evidence that OR descriptions may be more cognitively demanding is that OR subjects took twice as long to resolve them and that overspecified REs included more vocabulary for subjects to process. This is also consistent with the results of Engelhardt's experiments \cite{Engelhardt_Bailey_Ferreira_2006}, in which it was found that listeners ``do not judge over-descriptions to be any worse than concise expressions''.

	We found the subjects' evaluations of the utility of the Exercise Room to be significantly higher in the OR condition: subjects from the OR condition rated the Exercise Room as more effective than subjects from the MR condition did (86\% compared to 75\%). The participants themselves thus deemed the overspecification training exercises more useful than the minimal specification exercises. This is a further corroboration of our hypothesis regarding the utility of overspecification, this time in subjective terms.

\section{Second Experiment Discussion}

	Our hypothesis for the second experiment was that the lexical acquisition rate would be higher than in the MR condition, because the subjects receive more information and more opportunity to `infer' the identity of objects, and that the resolution time would be shorter than in the OR condition, because the information is presented gradually, in `chunks' that are easier to process. This was confirmed by our results: the lexical acquisition rate was higher than that of the MR subjects from the first experiment, and the resolution speed was higher than that of the OR subjects.

	For Taxonomical and Absolute lexeme acquisition rates, we can observe a pattern similar to that of the first experiment: Taxonomical lexemes were acquired more efficiently than Absolute lexemes, and the overall lexeme acquisition rate fell between the two. This is, once again, in line with the Primacy of Nouns theory \cite{gentner82}.

	The error overcoming rate was also substantial: errors were reduced by 32\% between the First Test Phase and the Second Test Phase (see Table \ref{table6}). This reduction indicates that the Exercise Phase served its purpose of helping subjects remember the vocabulary learned. This is also corroborated by the success rate in the Exercise Phase, which is near 63\% (see Table \ref{fig7}).
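
	One way to make this metric explicit, assuming that the error overcoming rate is the relative reduction in errors between the two test phases (the symbols $E_1$, $E_2$ and $\mathrm{EOR}$ are introduced here for illustration only), is
\[
	\mathrm{EOR} = \frac{E_1 - E_2}{E_1},
\]
	where $E_1$ is the number of errors in the First Test Phase and $E_2$ the number of errors in the Second Test Phase. Under this reading, a rate of 32\% would correspond to $E_2 = 0.68\,E_1$.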

	The overspecification rate is also a useful characteristic because it indicates in which cases subjects hesitated and needed more information to resolve an RE. The overspecification rate was 51\%, meaning that in half of the cases, the subjects were not sure which object to choose. The extra information was given to subjects if they started moving \emph{away} from the referent. While this is not a foolproof way to measure hesitation, our reasoning was that it is the closest that we can come to doing so in a virtual environment. In a real-life situation one would have access to the instruction follower's gestures (and requests for clarification), but these were not available in our virtual world. We therefore considered that hesitation could most easily be measured by movement of the subject in another direction. There was wide variation in the amount of overspecification received, ranging from 1 to 12 (out of a possible 12 REs), so we can see that it was not always necessary to give more information to subjects in order for them to resolve REs. Also, the average time between giving the minimal and the overspecified RE was 9 seconds, meaning that it should not affect the overall resolution speed of the subjects: after receiving the extra information, subjects still had time to identify the correct referent.


	A final objective parameter that we looked at was the role of saliency in reference resolution and, more specifically, in overspecification. While the success rate of references given when the referent was visible was similar to the average success rate (63\% vs. 64\%), the overspecification rate was almost half the average one (24\% vs. 51\%). This suggests that when an object was already visible, subjects were more likely to select it as the referent, whether or not it was the correct one. This is consistent with the results of Landragin \cite{LANDRAGIN-2001}, because a visible object would be more salient than others, and so subjects would be likely to choose it when in doubt as to which referent is the correct one.

	If we look at the same subjective metrics as for the first experiment, we see that the lowest score was given for Q3 (see Table \ref{fig8}), which means that the hardest aspect of the task for the subjects was understanding the descriptions in Russian. However, Q4, regarding the utility of the Exercise Room, has a very high score, meaning that the subjects found practicing the vocabulary very useful for remembering it. This further confirms our hypothesis, this time from a subjective point of view.

\section{General Discussion}

	If we compare the results of the two experiments, we can see that our hypotheses were confirmed: giving overspecified REs to subjects in the stage of lexical acquisition helped them acquire new lexemes more successfully than subjects receiving minimal specification, albeit with significantly longer resolution time. However, in the second experiment, we have seen that while giving the overspecified information gradually reduces resolution time, it does not improve the subjects' performance in the experiment.

	More concretely, we can see that while resolution speed is 101 m/s in the MR condition and 50 m/s in the OR condition, in the FR condition it is between the two, at 79 m/s. A similar pattern can also be observed in error overcoming and lexeme acquisition rates: the performance of the FR subjects is consistently between that of the OR and MR conditions. The fact that the success rate for FR is significantly lower than that of OR (65\% vs. 89\%) indicates that `chunking' the information is not always optimal. However, in the specific context of lexical acquisition, we can argue that subjects ``optimize'' the quantity of information they are given: when they were given minimal information, many tried to resolve the reference right away, without waiting for the second part of the RE. The average success rate for minimal references in the FR condition was 73\%, when it should have been perfect; this means that in many cases, subjects did not exhibit hesitation in an overt way, even though they selected the wrong referent.

	In comparing the subjective metrics (Tables \ref{fig4} and \ref{fig8}), we can see that for Q1, Q2 and Q3, the answers from the OR and FR subjects were very similar, which means that none of these subjects had trouble with the instructions or the descriptions in Russian. However, in the case of Q4 (utility of the Exercise Phase), the evaluation of the FR subjects was about 8 percentage points higher than that of the OR subjects and about 19 percentage points higher than that of the MR subjects (94.2\% vs. 85.8\% vs. 75.1\%), meaning that FR subjects were those who felt that they profited most from the Exercise Room. \fxnote{elaborate more?}


















